CIE4604 - Simulation and Visualisation

Assignment 3 - Time Series Analysis of Vegetation Index


Student Name(s): Pratyush Kumar
Student Number(s): 5359252

In this assignment we do a time series analysis of NDVI using the HANTS (Harmonic Analysis of Time Series) algorithm and visualize the results in Python and QGIS.

For a description of the assignment consult the CIE4604-A3-NDVI_HANTS.pdf document.

Part I - Compute and Visualize NDVI in QGIS

If you prefer, you may add the QGIS results of part I here. See CIE4604-A3-NDVI_HANTS.pdf a detailed description of the actions.

In your report (or here) include at least the following results:

Furthermore, include in the zip file accompanying the report,

Answer

For extraction of the raster images by the extent defined by the shapefile, i did the following:

  1. reproject the bounding area to UTM Zone 30N
  2. used GDAL to crop out the extent of the images as raster using the reprojected vector layer
  3. used this cropped extent to extract the Raster images using GDAL's clip raster by mask layer
  4. After this the raster calculator was used to produce the NDVI.

The colormap used for this purpose, is a diverging colormap with shades of brown for values below 0 and shades of green for values above 0. This was chosen also in a color blind freindly version of brown and green so that the maps are visible to everyone as intended. Moreover, the color scheme was chosen such that the greener pixel is indicative of higher NDVI value.

The GDAL command to make raster masks were done as follows img

The image below is of the layout formed using QGIS: img

The image below is of the attribute table exported as an xlsx file: img

Part II - NDVI time series analysis with HANTS

HANTS is a software using the Fourier Analysis in order to detect outliers (clouds) and reconstruct image time series.
The analysis is performed over Sector BXII, an agricultural area located close to Sevilla (Spain). Input data files for this script are:

Optional (non-required) input file(s) are:

The actual HANTS algorithm is implemented in the Python function hants.py, see also help(hants.py).

This script is a template that you have to modify in places to obtain the desired results. Enter your answers as markdown.

Load NDVI for Sector BXII, Sevilla, Spain

The input files are:

The code for reading these dataset was already developed in Exercise 4. The following variables are created

You should definitely take a look at each variable to see what it is about.

Colormap selection

To plot the NDVI for the first day we are going to use matplotlib's imshow method to make a pseudo-color map. A good colormap for the NDVI is essential! It is up to you to come up with a good colormap for the NDVI, taking into account the lessons learned from Exercise 1 and 2. You may also use your answer from Exercise 4 here (this is a repetition).

Start by formulating the requirements for the color map. What kind of color map should we select? What is the range? Do we use a linear map, is it continuous or discrete, do we use two tones? What kind of normalization are we going to use? In any case, the default color map is certainly not a good choice. Why?

Have a look for instance at the following article https://publiclab.org/notes/cfastie/08-26-2014/new-ndvi-colormap to get some inspiration.

Explain here your considerations for the colormap and the reasons for choice in markdown

In the section below set the cmap variable with the chosen colormap.

In Python you can read the text file or mat file and create the colormap using matplotlib.colors.ListedColormap. See also Exercise 1 and 2.

cmapTable = scipy.io.loadmat('file.mat')['colors']
cmap = colors.ListedColormap(data/255)

You can then declare this new colormap in the imshow function by declaring the input as cmap = cmap

Pseudo color image plot of unfiltered NDVI for the first day

To plot the NDVI for the first day we are going to use matplotlib's imshow method to make a pseudo-color map. Use the colormap that you selected above using the cmap = cmap optional parameter in imshow, together with 'clim' and 'extent'.

The optional extent parameter is used to set the proper x- and y-axis coordinates. It is computed from the image bounds, while converting meter units to km.

Create a video with the unfiltered NDVI data

A video is created showing the (unfiltered) NDVI data. We use imshow to display matrix NDVI(i,:,:) as an image for day i, using the colormap of your choice (see previous section).

The figure statement is called only once, imshow is called from within the loop over days, and the output is appended to the list of frames.

There are two ways to display the animation from a notebook,

You can also save the animation as either a mpeg file or an animated gif (both are commented out at the moment). The code is borrowed from Exercise 4.

The colormap used in the above animation and the plot is a diverging colormap from red to green. It has been chosen to portray that the greener the pixel is, the more NDVI value it has and vice versa with respect to the reds in the map.

Compute reconstructed time series using HANTS

Inputs for HANTS are the times ts (days) and the NDVI time series for one pixel. For the other input parameters see the help that comes with HANTS.

Based on the help information define your HANTS input parameters below. Some are preset, others are left open.

Set the line and sample number of the pixel for which to apply HANTS.

We will show you later how to find the line and sample numbers for specific pixel areas, e.g. one of the field in the training set.

Do HANTS, first we have to extract the pixel data into the array datain. We use the Matlab function squeeze to do this, because NDVI is stored as a three dimensional array and we have to collect all values that belong to a single pixel. We use squeeze because NDVI[:,row,col] does not return an array, but a list, and squeeze will convert this list to an array.

HANTS Outputs are:

  amp   = array of amplitudes, first element is the average of the curve
  phi   = array of phases, first element is zero
  dataout   = array holding reconstructed time series 
  outliers  = boolean array with the outliers 

Plot the original vs reconstructed time series

Plot HANTS for different HANTS params

this will help us to iterate through different values of frequencies and days

Plotting for different vegetation pixels

Select suitable HANTS input parameters

Use the above code as a template to select a suitable set of input parameters. You can try different values for the number of frequencies (nf) , base period (nb) and fit error tolerance.

You should at least

In the report, explain your choice for the base period (you don't have to show plots, there is only one good value for the base period), and include the plots for the chosen parameters.

Base period: It is chosen as 371 since the 33 NDVI images are over the period of 371 days.
nf: 3
fet: 0.1
From the above plots, it is decided to take nf value as 3 and the tolerance set at 0.1, since at nf=4 we start seeing that the curve almost fits the highest values whereas on increasing the nf value further it is observed that the curve can be called as over-fitting the values.

Apply HANTS for all pixels

Apply the HANTS algorithm for all pixels, and save the amplitude, phase and reconstucted NDVI.

THIS IS A VERY TIME CONSUMING PROCESS THAT MAY TAKE 15-30 MINUTES! SO ONLY DO THIS WHEN YOUR ARE SURE YOU GOT THE CORRECT PARAMETERS, OR TEST IT FIRST ON A SMALLER SAMPLE.

Save the results now!!

Variables can be saved and loaded with the Spyder IDE. Do that for the newly computed results, so that for the follow on work you don't have to redo the time consuming processing every time. Here we will make use of pickle library to store the variables so that we can always reload them later.

If the variables have to be reloaded you can use the following code.

Write GeoTIFF files with reconstructed NDVI, amplitude and phase

Three new GeoTIFF files are created, with

These files can easily be imported by other software such as QGIS.

Create a video of the reconstruced NDVI data

Create a video showing the reconstructed NDVI data.

Instead of an animation we can also create a lot of subplots using the code below.

Plot amplitude and phase from HANTS (Part 2.D)

Make a figure with in the subplots a pseudo color image of the amplitude and phase for each frequency. Please think carefully about the colormap to use.

Which colormap do you use for the amplitude. Should the colormap for the phase be the same, or different. Please explain why.

Plotting cloudy vs reconstructed NDVI from a cloudy day (Part 2.E)

28 November 2017 was significantly cloudy, so we will use that image, from the dataframe we can determine that the index for this image is 22

Analyze the training dataset

The training dataset is a shapefile with information on the actual crop that is growing in the fields. The original training dataset was in UTM zone 29N. We used QGIS to convert this to UTM zone 30N, which was is also the zone that is used for the NDVI data.

First we must read the shape file with training data

training = shapefile.Reader("Training_set/Training_2017_UTM30N_WGS84_SP.shp")

The training dataset consists of 15 polygons. We will create an plot with the amplitude data and overlay the polygon boundaries.

Part III - Visualize HANTS output in QGIS

If you prefer, you may add the QGIS results of part III here. See CIE4604-A3-NDVI_HANTS.pdf a detailed description of the actions.

The amplitude obtained from the reconstruction of the images was loaded into QGIS and the amplitudes stored in bands 1,3,4 were mapped to the R,G,B bands on the visible image.

The image finally looks like the following:

img

[End of the assignment]